feat(loki.write): Add loki pipeline latency metric #5702
Conversation
Yes it would add 8 additional bytes in WAL for every entry.
Good point, I will have to think a bit about this one, but I think it would be acceptable to have one created timestamp per batch. If that is the case we could store one Created timestamp for the entire batch. The overhead in WAL would be less with batches then, because we would store 8 bytes for the whole batch instead of 8 bytes per entry in the batch. And yes, we are likely going to batches; we are working on the design doc, just going a bit slow.
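The per-batch vs per-entry trade-off discussed above can be sketched with rough numbers (illustrative only; these are not the actual WAL record layouts, just the 8-byte timestamp arithmetic from the comment):

```go
package main

import "fmt"

// createdOverheadBytes estimates the extra WAL bytes needed to store
// created timestamps, assuming each timestamp is an 8-byte value
// (e.g. int64 Unix nanoseconds). Illustrative, not the real format.
func createdOverheadBytes(entries int, perBatch bool) int {
	const timestampSize = 8
	if perBatch {
		return timestampSize // one timestamp for the whole batch
	}
	return entries * timestampSize // one timestamp per entry
}

func main() {
	fmt.Println(createdOverheadBytes(1000, false)) // per-entry: 8000 bytes
	fmt.Println(createdOverheadBytes(1000, true))  // per-batch: 8 bytes
}
```

For large batches the per-batch variant amortizes the timestamp to a constant 8 bytes, which is why the batching design makes the overhead acceptable.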
I guess I was also thinking about general pipeline overhead but that's more nebulous / hard to gauge. It's an extra field on a struct so probably not important.
I think that makes a lot of sense!
FTR my comment wasn't suggesting it's moving too slow, just qualifying my question in case it was out of date 😅
We are now writing one created timestamp to WAL for each batch.
No input on docs content - one-line change |
Co-authored-by: Piotr <17101802+thampiotr@users.noreply.github.com>
```go
out <- collapsed
entry := s.startLineEntry
entry.Line = s.buffer.String()
```
This change turns the previous deep-copy into a shallow copy of s.startLineEntry. That means the emitted entry can still share underlying references (e.g., extracted map / labels map / structured metadata slice) with s.startLineEntry, which can lead to unintended mutations across pipeline stages and potential data races if downstream stages modify those structures. Consider restoring the previous behavior by cloning the mutable fields (at least extracted, labels, and structured metadata) when flushing, while still preserving the created timestamp propagation.
```go
// Clone mutable fields to avoid sharing underlying references with s.startLineEntry.
if entry.Extracted != nil {
	clonedExtracted := make(map[string]interface{}, len(entry.Extracted))
	for k, v := range entry.Extracted {
		clonedExtracted[k] = v
	}
	entry.Extracted = clonedExtracted
}
if entry.Labels != nil {
	clonedLabels := make(model.LabelSet, len(entry.Labels))
	for k, v := range entry.Labels {
		clonedLabels[k] = v
	}
	entry.Labels = clonedLabels
}
if entry.StructuredMetadata != nil {
	// Use append with a zero-length, zero-capacity slice to force allocation of a new backing array.
	entry.StructuredMetadata = append(entry.StructuredMetadata[:0:0], entry.StructuredMetadata...)
}
```
This is safe; the state is local to a specific stream. When we detect the start of a multiline entry we set startLineEntry to that entry and continue.
When the line is finished we update this entry's line with everything we have written into the buffer and reset the state. In reset we clear the buffer, set currentLines to 0, and set startLineEntry to its zero value, so there are no more references to extracted, structured metadata, and labels.
The copy of these we had before was unnecessary and was just doing additional allocations.
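The lifecycle described above can be sketched as follows (type and field names are illustrative stand-ins, not the actual `stage.multiline` implementation): the shallow copy on flush is safe because reset immediately zeroes the stored entry, dropping all shared references.

```go
package main

import (
	"bytes"
	"fmt"
)

// Illustrative stand-ins for the real pipeline types.
type Entry struct {
	Line      string
	Extracted map[string]interface{}
}

type multilineState struct {
	buffer         bytes.Buffer
	currentLines   int
	startLineEntry Entry
}

// flush emits the buffered entry. The shallow copy is safe here because
// reset immediately drops the state's references to the shared maps.
func (s *multilineState) flush() Entry {
	entry := s.startLineEntry
	entry.Line = s.buffer.String()
	s.reset()
	return entry
}

// reset clears the buffer, zeroes the line count, and sets startLineEntry
// to its zero value, so no references to extracted data remain in state.
func (s *multilineState) reset() {
	s.buffer.Reset()
	s.currentLines = 0
	s.startLineEntry = Entry{}
}

func main() {
	s := &multilineState{
		startLineEntry: Entry{Extracted: map[string]interface{}{"level": "info"}},
	}
	s.buffer.WriteString("line one\nline two")
	out := s.flush()
	fmt.Println(out.Line)
	fmt.Println(s.startLineEntry.Extracted == nil) // true: state holds no references
}
```

Since each stream owns its own state and the state is zeroed before the next entry begins, only the emitted entry retains the maps, so no deep copy is needed on the happy path.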
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…flushed (#5746)
Issue(s) fixed by this Pull Request
In #5702 some changes to `stage.multiline` were made to support the created timestamp, along with some optimizations (no longer doing a deep clone). But we also made sure to always reset to a zero entry. This is an issue when a multiline entry is flushed before completion, either by max lines or max wait: all lines that are not considered a start line will then be added to an empty entry and later flushed. To fix this we revert to the old behavior, where we clone the stored entry and keep it around for other lines.
Notes to the Reviewer
PR Checklist
- [ ] Documentation added
- [x] Tests updated
- [ ] Config converters updated
Co-authored-by: Piotr <17101802+thampiotr@users.noreply.github.com>
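The fix in #5746 can be sketched like this (illustrative types again, not the actual code): on a premature flush, emit a clone of the stored entry and keep the original alive for the continuation lines that follow.

```go
package main

import (
	"bytes"
	"fmt"
)

// Illustrative stand-ins for the real pipeline types.
type Entry struct {
	Line      string
	Extracted map[string]interface{}
}

type multilineState struct {
	buffer         bytes.Buffer
	startLineEntry Entry
}

// flushPartial is called when max lines or max wait is hit before the
// multiline entry is complete. It emits a clone of the stored entry and
// keeps the original around, so later continuation lines still attach
// to a live entry instead of a zeroed one.
func (s *multilineState) flushPartial() Entry {
	entry := s.startLineEntry
	// Deep-clone the mutable map so the emitted entry does not share
	// references with the entry we continue to use.
	if entry.Extracted != nil {
		cloned := make(map[string]interface{}, len(entry.Extracted))
		for k, v := range entry.Extracted {
			cloned[k] = v
		}
		entry.Extracted = cloned
	}
	entry.Line = s.buffer.String()
	return entry
}

func main() {
	s := &multilineState{startLineEntry: Entry{Extracted: map[string]interface{}{"app": "demo"}}}
	s.buffer.WriteString("partial line")
	out := s.flushPartial()
	out.Extracted["app"] = "mutated"
	// The retained entry is unaffected by mutations on the emitted clone.
	fmt.Println(s.startLineEntry.Extracted["app"]) // demo
	fmt.Println(out.Line)                          // partial line
}
```

This restores the safety of the old deep-clone behavior exactly where it matters: only on partial flushes, where state and emitted entry must both stay usable.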
Pull Request Details
Alternative to #5656.
In this PR I added loki_write_entry_propagation_latency. This is a histogram of the time it takes from a source until an entry is written over the wire. Every source now uses loki.NewEntry or loki.NewEntryWithCreated, and multiline will propagate the created timestamp from the first entry. When the entries are either sent or dropped by loki.write, we record the time it took for all of them. I also added a new version for entries in WAL so that the created time is encoded.
Issue(s) fixed by this Pull Request
Notes to the Reviewer
PR Checklist